Abstract: A bandwidth puzzle was recently proposed to defend
against colluding adversaries in peer-to-peer networks. The
colluding adversaries do not do actual work but claim to have 
uploaded contents for each other to gain free credits from the 
system. The bandwidth puzzle guarantees that if the adversaries 
can solve the puzzle, they must have spent substantial bandwidth, 
the size of which is comparable to the size of the contents they 
claim to have uploaded for each other. Therefore, the puzzle 
discourages the collusion. In this paper, we study the performance 
of the bandwidth puzzle and give a lower bound on the average 
number of bits the adversaries must receive to be able to solve 
the puzzles with a certain probability. We show that our bound 
is tight in the sense that there exists a strategy to approach this 
lower bound asymptotically within a small factor. The new bound 
gives better security guarantees than the existing bound, and can 
be used to guide better choices of puzzle parameters to improve 
the system performance. 

I. Introduction 

A key problem in peer-to-peer (p2p) based content sharing 
is the incentive for peers to contribute bandwidth to serve other 
peers [16]. Without a robust incentive mechanism, peers may
choose not to upload contents for other peers, causing the entire 
system to fail. In many applications, a peer's contribution is 
measured by the number of bits it uploaded for other peers. 
It is difficult to measure the contribution because peers may 
collude with each other to get free credits. For example, if 
Alice and Bob are friends, Alice, without actually uploading, 
may claim that she has uploaded a certain amount of bits for 
Bob. Bob, when asked about this claim, will attest that it is 
true because he is Alice's friend. Therefore, Alice gets free 
credits. 

With the current Internet infrastructure, such collusions are 
difficult to detect, because the routers do not keep records 
of the traffic. Recently, a bandwidth puzzle scheme has been 
proposed to solve this problem [14]. In the bandwidth puzzle
scheme, a central credit manager, called the verifier, is assumed 
to exist in the network. The verifier issues puzzles to suspected 
nodes, called provers, to verify whether the claimed transactions
are true. To be more specific, when the verifier suspects
a set of provers for certain transactions, it issues puzzles 
simultaneously to all the involved provers, and asks them to 
send back answers within a time threshold. The puzzle's main 
features are (1) it takes time to solve a puzzle and (2) a puzzle 
can be solved only if the prover has access to the contents. To
illustrate the basic idea of the puzzle, consider the previous
simple example with Alice and Bob. The verifier issues two
puzzles, one to Alice and one to Bob. As Alice did not upload
the content to Bob, Alice has the content but Bob does not.
Upon receiving the puzzles, Alice can solve hers and send the
answer to the verifier before the threshold, but Bob cannot. Bob
also cannot ask Alice for help, because Alice cannot solve
two puzzles within the threshold. Given this, Bob will fail to
reply with the answer to his puzzle, and the verifier will know
that the transaction did not take place.

The bandwidth puzzle is most suited for live video broadcast 
applications, where fresh contents are generated constantly
[14]. The verifier can naturally reside in the source node of the
video, and the puzzle is based on the unique content currently 
being broadcast, such that there can be no existing contents 
downloaded earlier to solve the puzzles. The construction of
the bandwidth puzzle is simple and based only on hash functions
and pseudorandom functions. In [14], the puzzle scheme was
implemented and incorporated into a p2p video distribution
system, and was shown to be able to limit collusions significantly.
An upper bound was also given for the expected number
of puzzles that can be solved given the limit of the number of
bits received among the adversaries. However, the bound is
"loose in several respects," as stated by the authors, because
its dominating term is quadratic in the number of adversaries,
such that it deteriorates quickly as the number of adversaries
increases. In this paper, we give a much improved bound on the
performance of the puzzle. The new bound gives the average
number of bits the adversaries must have received if they can
solve the puzzles with a certain probability. As we will prove,
the average number of bits the adversaries receive is linear in
the number of adversaries, for any number of adversaries. It is also
asymptotically tight, in the sense that there exists a strategy
that achieves this bound asymptotically within a small factor.
The improved bound leads to more relaxed constraints on the
choice of puzzle parameters, which should in turn improve the 
system performance. 

The rest of this paper is organized as follows. Section II
describes the construction of the puzzle. Section III gives the
proof of the new bound. Section IV discusses the practical
puzzle parameters and shows how a simple strategy approaches
the bound. Section V discusses related works. Section VI
concludes the paper.



$n$         The number of bits in the content
$k$         The number of indices in an index set
$L$         The number of index sets in a puzzle
$z$         The number of puzzles sent to a prover
$\theta$    The time threshold to solve the puzzles

TABLE I
List of Puzzle Parameters

II. The Construction 



In this section, we describe the construction of the puzzle.
The puzzle construction is largely the same as that in [14] except
for one difference: allowing repeated indices in one index set (the
definition of index set will be given shortly), which simplifies
the puzzle construction. We first give a high-level overview of
the puzzle construction and introduce some notation.
The main parameters of the puzzle are listed in Table I.

A. A High-level Description 

The content being challenged is referred to simply as the content.
There are $n$ bits in the content, each given a unique index.
An index set is defined as $k$ ordered indices chosen from the $n$
indices. Each index set defines a string denoted as $str$, called
the true string of this index set, which is obtained by reading
the bits in the content according to the indices. $str$ can be
hashed using a hash function denoted as hash, and the output
is referred to as the hash of the index set. To construct a puzzle,
the verifier needs $L$ index sets denoted as $I_1, \ldots, I_L$, where
an index set is obtained by randomly choosing the indices,
allowing repeats. The verifier randomly chooses one index set
among the $L$ index sets, denoted as $I_{\hat{\ell}}$, called the answer
index set. It uses hash to get the hash of $I_{\hat{\ell}}$, denoted as $h$,
which is called the hint of the puzzle. The puzzle is basically the $L$
index sets and $h$. When challenged with a puzzle, the prover
should prove that it knows which index set hashes into $h$, by
presenting another hash of $I_{\hat{\ell}}$ generated by hash function ans.
The purpose of using ans is to reduce the communication cost,
as $str_{\hat{\ell}}$ may be long. The verifier may issue $z$ puzzles to the
prover, and the prover has to solve all the puzzles before a time
threshold $\theta$.

From a high level, the strengths of the puzzle are (1) a prover
has to know the content, otherwise it cannot get the true strings
of the index sets, and (2) even if the prover knows the content, it
still needs to spend time to try different index sets until it
finds an index set with the same hash as the hint, referred to
as a confirm event, because the hash function is one-way. In
practice, the verifier need not generate all index sets; it need 
only generate and find the hash of the answer index set. The 
verifier should not send the L index sets to the prover because 
this requires a large communication cost; instead, the verifier 
and the prover can agree on the same pseudorandom functions 
to generate the index sets and the verifier sends only a key for 
the pseudorandom functions. Therefore, this construction has 
low computation cost and low communication cost. 

$q_{hash}$  The number of hash queries allowed, determined by $\theta$
$\Omega$    A special oracle for hash and content queries
$V$         The maximum number of missed bits
$\delta$    A positive number determined by puzzle parameters

TABLE 2
List of Notations in the Proof

As an example, suppose $n = 8$ and the content is 00110101.
Suppose $k = 4$, $L = 3$, and the three index sets in the puzzle
are $I_1 = \{5,3,7,0\}$, $I_2 = \{1,2,6,3\}$, and $I_3 = \{2,3,5,3\}$.
Correspondingly, $str_1 = 1110$, $str_2 = 0101$, and $str_3 = 1111$.
Suppose the verifier chooses $\hat{\ell} = 1$. Suppose hash is simply
the parity bit of the string, such that $h = 1$. The prover
receives the hint and generates the three index sets, and
finds that only $I_1$ has parity bit 1. Suppose ans is simply the
parity bit of every pair of adjacent bits. The prover presents
'01', which proves that it knows $I_1$ is the answer index set.
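
The toy example above can be checked mechanically. The following Python sketch is only an illustration: the helper names (true_string, toy_hash, toy_ans) are hypothetical, indices are taken as 0-based positions in the content string, and "every pair of adjacent bits" is read as non-overlapping pairs, which is the reading that yields the two-bit answer '01'.

```python
# Toy puzzle from the example: n = 8, k = 4, L = 3.
content = "00110101"
index_sets = [[5, 3, 7, 0], [1, 2, 6, 3], [2, 3, 5, 3]]  # I_1, I_2, I_3


def true_string(content, index_set):
    """Read the content bits in the order given by the index set."""
    return "".join(content[i] for i in index_set)


def toy_hash(s):
    """Toy stand-in for hash: the parity bit of the string."""
    return s.count("1") % 2


def toy_ans(s):
    """Toy stand-in for ans: parity of each non-overlapping pair of bits."""
    return "".join(
        str((int(s[i]) + int(s[i + 1])) % 2) for i in range(0, len(s), 2)
    )


strings = [true_string(content, idx) for idx in index_sets]
hint = toy_hash(strings[0])  # the verifier chose the first index set

# The prover tries every index set and keeps those whose hash matches the hint.
confirms = [j + 1 for j, s in enumerate(strings) if toy_hash(s) == hint]
answer = toy_ans(strings[confirms[0] - 1])
```

With a one-bit hash, a unique confirm is a property of this particular example rather than of the scheme; the actual construction relies on a cryptographic hash.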

B. Detailed Puzzle Construction 

In the construction, it is assumed that the keys of the
pseudorandom functions and the output of the hash functions
are both $\kappa$ bits. In practice, $\kappa = 160$ suffices.

Pseudorandom functions are used to generate the index sets.
A pseudorandom function family $\{f_K\}$ is a family of functions
parameterized by a secret key. Roughly speaking, once initialized
by a key, a pseudorandom function generates outputs
that are indistinguishable from true random outputs. Two
pseudorandom function families are used:
$\{f^1_K : \{1, \ldots, L\} \to \{0,1\}^\kappa\}$ and
$\{f^2_K : \{1, \ldots, k\} \to \{1, \ldots, n\}\}$.

Two hash functions are used in the construction, hash and
ans. hash is used to get the hint. It actually hashes the concatenation
of a $\kappa$-bit key, a number in the range of $[1, L]$, and a
$k$-bit string into $\kappa$ bits:
$\{0,1\}^\kappa \times \{1, \ldots, L\} \times \{0,1\}^k \to \{0,1\}^\kappa$.
To prove the security of the puzzle, hash is modeled as a
random oracle [16]. The other hash function is
ans : $\{0,1\}^k \to \{0,1\}^\kappa$. For ans, only collision-resistance is assumed.

As mentioned earlier, a puzzle consists of the hint $h$ and
$L$ index sets. The verifier first randomly picks a $\kappa$-bit string
as key $K_1$. Then it randomly picks a number $\hat{\ell}$ from $[1, L]$ as
the index of the answer index set. With $K_1$ and $\hat{\ell}$, it generates
$K_2 \leftarrow f^1_{K_1}(\hat{\ell})$. $K_2$ is used as the key for $f^2_{K_2}$ to generate the
indices in the answer index set: $I_{\hat{\ell}} = \{f^2_{K_2}(1), \ldots, f^2_{K_2}(k)\}$.
The verifier then finds $str_{\hat{\ell}}$. It then uses the concatenation of
$K_1$, $\hat{\ell}$, and $str_{\hat{\ell}}$ as the input to hash and uses the output as
$h$: $h \leftarrow hash(K_1, \hat{\ell}, str_{\hat{\ell}})$. Including $K_1$ and $\hat{\ell}$ ensures that the
results of one puzzle-solving process cannot be used in the
solving process of another puzzle, regardless of the content, $k$,
and $L$. The prover can generate index sets in the same way as
the verifier generates the answer index set, and can compare
the hash of the index sets with the hint until a confirm is found.
When the prover finds a confirm upon string $str_\ell$, it returns
$ans(str_\ell)$.
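
A minimal sketch of the construction in this subsection, written in Python. It is not the implementation used in [14]: HMAC-SHA256 truncated to 160 bits stands in for both pseudorandom function families, SHA-256 stands in for hash and ans, indices are 0-based and obtained by reduction modulo n (which introduces a small bias), and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets

KAPPA_BYTES = 20  # kappa = 160 bits


def prf(key, x):
    """Stand-in PRF (assumption): HMAC-SHA256 truncated to kappa bits."""
    msg = x.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:KAPPA_BYTES]


def index_set(k1, ell, k, n):
    """K2 = f1_{K1}(ell); the k indices are f2_{K2}(1), ..., f2_{K2}(k)."""
    k2 = prf(k1, ell)
    return [int.from_bytes(prf(k2, j), "big") % n for j in range(1, k + 1)]


def true_string(content, indices):
    """Read the content bits in the order given by the index set."""
    return "".join(content[i] for i in indices)


def puzzle_hash(k1, ell, s):
    """Stand-in for hash(K1, ell, str): SHA-256 over the concatenation."""
    data = k1 + ell.to_bytes(8, "big") + s.encode()
    return hashlib.sha256(data).digest()[:KAPPA_BYTES]


def ans(s):
    """Stand-in for ans: a domain-separated SHA-256 of the true string."""
    return hashlib.sha256(b"ans" + s.encode()).digest()[:KAPPA_BYTES]


def make_puzzle(content, k, L):
    """Verifier: returns (K1, hint) to send, plus the expected answer."""
    k1 = secrets.token_bytes(KAPPA_BYTES)
    ell_hat = secrets.randbelow(L) + 1  # answer index in [1, L]
    s = true_string(content, index_set(k1, ell_hat, k, len(content)))
    return k1, puzzle_hash(k1, ell_hat, s), ans(s)


def solve_puzzle(content, k, L, k1, hint):
    """Prover: regenerate index sets 1..L until a confirm is found."""
    for ell in range(1, L + 1):
        s = true_string(content, index_set(k1, ell, k, len(content)))
        if hmac.compare_digest(puzzle_hash(k1, ell, s), hint):
            return ans(s)
    return None  # no confirm: the prover does not hold the content
```

Only the key and the hint travel from verifier to prover, which is why the verifier never has to transmit the L index sets themselves.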

III. The Security Bound 

In this section, we derive the new bound for the bandwidth 
puzzle. Although the puzzle is designed to defend against 
colluding adversaries, we begin with the simple case when 
there is only one adversary given only one puzzle, because the 
proof for this simple case can be extended to the case when 
multiple adversaries are given multiple puzzles. 

A. Single Adversary with a Single Puzzle 

Consider a single adversary challenged with one puzzle. 
We begin with assumptions and definitions. Some key proof 
parameters and notations are listed in Table 2.






1) Assumptions and Definitions: In the proof, we model
hash and ans as random oracles and refer to them as the hash
oracle and the answer oracle, respectively. Obtaining a bit in
the content is also modeled as making a query to the content
oracle denoted as content. The adversary is given access to
hash, ans, and content. To model the computational constraint
of the prover in the limited time $\theta$ allowed to solve the puzzle,
we assume the number of queries to hash is no more than $q_{hash}$.
To ensure that honest provers can solve the puzzle, $q_{hash} \ge L$.
However, we do not assume any limitations on the number of
queries to content and ans. We refer to a query to content as a
content query and a query to hash as a hash query. We use $A$ to
denote the algorithm adopted by the adversary.

In our proof, we define a special oracle, $\Omega$, as an oracle that
answers two kinds of queries, both the content query and the
hash query. Let $B$ be an algorithm for solving the puzzle when
given access to the special oracle $\Omega$ and the answer oracle ans.
If $B$ makes a content query, $\Omega$ simply replies with the content
bit. In addition, it keeps the history of the content queries
made. We say a hash query to $\Omega$ is informed if there are no
more than $V$ bits missing in the index set and uninformed
otherwise, where $V$ is a proof parameter much smaller than
$k$. If $B$ makes an informed hash query for $I_\ell$, $\Omega$ replies with
the hash of $I_\ell$; otherwise, it returns 0. In addition, if $B$ makes
more than $L$ hash queries for the puzzle, $\Omega$ will not answer
further hash queries.
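
The behavior of the special oracle can be sketched as a small Python class. This is an illustration under stated assumptions rather than part of the proof: the class and method names are hypothetical, hash_fn is any caller-supplied stand-in for hash, missing bits are counted per position of the index set, every hash query (informed or not) counts toward the limit of L, and a refused query is represented by None.

```python
class SpecialOracle:
    """Sketch of the special oracle: content queries plus gated hash queries."""

    def __init__(self, content, index_sets, hash_fn, V):
        self.content = content        # the n content bits, as a string
        self.index_sets = index_sets  # dict: ell -> list of indices
        self.hash_fn = hash_fn        # stand-in for hash, called as hash_fn(ell, str)
        self.V = V                    # maximum number of missed bits
        self.queried = set()          # history of content queries
        self.hash_queries = 0         # number of hash queries made so far

    def content_query(self, i):
        """Reply with content bit i and remember that it was queried."""
        self.queried.add(i)
        return self.content[i]

    def missing_bits(self, ell):
        """Number of positions in index set ell whose bit was never queried."""
        return sum(1 for i in self.index_sets[ell] if i not in self.queried)

    def hash_query(self, ell):
        """Answer an informed query with the hash, an uninformed one with 0."""
        if self.hash_queries >= len(self.index_sets):
            return None  # more than L hash queries: no further answers
        self.hash_queries += 1
        if self.missing_bits(ell) > self.V:
            return 0     # uninformed query: no information is revealed
        s = "".join(self.content[i] for i in self.index_sets[ell])
        return self.hash_fn(ell, s)
```

The point of the gate is visible in hash_query: an algorithm that has not fetched enough bits of an index set learns nothing from asking about it, which is what makes the number of informed queries a usable handle on the number of content queries.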

2) Problem Formalization: The question we seek to answer
is: given $q_{hash}$, if the adversary has a certain advantage
in solving the puzzle, how many content queries must it
make to content on average? In the context of p2p content
distribution, this is analogous to giving a lower bound on the
average number of bits a peer must have downloaded if it
can pass the puzzle challenge with a certain probability. Note
that we emphasize the average number of bits because a
deterministic bound may be trivial: if the adversary happens to
pick the answer index set in the first attempt of hash queries,
only $k$ content queries are needed. However, the adversary may
be lucky once but is unlikely to be always lucky. Therefore, if
challenged with a large number of puzzles, the average number
of queries it makes to content must be above a certain lower
bound, which is the bound we seek to establish.

In an earlier work [14], an upper bound was given on the
expected number of puzzles that can be solved if the adversary
is allowed $q_{hash}$ hash queries and a certain number of content
queries. In this work, we remove the assumption on the maximum
number of content queries. With fewer assumptions, our proof
is less restrictive and applies to more general cases. The new
problem is different from the problem studied in [14], and
new techniques are needed to establish the bound. Note that
although the adversaries are allowed to download as many bits
as they wish, they prefer to employ an intelligent algorithm
to minimize the number of downloaded bits because their
intention is to use collusion to avoid spending bandwidth. The
new bound guarantees that, if the adversaries wish to have a
certain advantage in solving the puzzles, there exists a lower
bound on the average number of bits they have to download,
regardless of the algorithm they adopt.



3) Proof Sketch: A sketch of our proof is as follows. As it
is difficult to derive the optimal algorithm the adversary may
adopt, our proof is "indirect." That is, by using $\Omega$, we introduce
a simplified environment which is easier to reason about.
We show that an algorithm can be found in the simplified
environment with performance close to that of the best algorithm
the adversary may adopt in the real environment. This
provides a link between the simplified environment and the real
environment: knowing the bound for the former, the bound for
the latter is a constant away. We establish the performance
bound of the optimal algorithm in the simplified environment,
by showing that to solve the puzzle with a certain probability,
an algorithm must make a certain number of informed hash
queries to $\Omega$ and the average number of unique indices in
the informed queries, i.e., the number of content queries, is
bounded.

4) Proof Details: Given any algorithm $A$ the adversaries
may adopt, we construct an algorithm $B_A$ that employs $A$
and implements oracle queries for $A$. $B_A$ terminates when $A$
terminates, and returns what $A$ returns. When $A$ makes a query,
$B_A$ replies as follows:

Algorithm 1 $B_A$ answers oracle queries for $A$
1: When $A$ makes a query to content, $B_A$ makes the same
   content query to $\Omega$ and returns the result to $A$.
2: When $A$ makes a query to ans, $B_A$ makes the same query
   to ans and returns the result to $A$.
3: When $A$ makes a query to hash for $I_\ell$:
   1) $B_A$ checks whether $A$ has made exactly the same
      query before. If yes, it returns the same answer as
      the last time.
   2) $B_A$ checks whether there are no less than $V$ bits in
      $I_\ell$ that have not been queried. If yes, it returns a
      random string.
   3) $B_A$ checks whether it has made a hash query for $I_\ell$
      before. If no, $B_A$ makes a hash query to $\Omega$. If a confirm
      is obtained upon this query, $B_A$ knows that $I_\ell$ is the
      answer index set, and sends content queries to $\Omega$ to get
      the remaining bits in $I_\ell$.
   4) If $I_\ell$ is not the answer index set, $B_A$ returns a
      random string.
   5) If the string $A$ submitted is the true string of $I_\ell$, $B_A$
      returns the hash of $I_\ell$.
   6) $B_A$ returns a random string.



Let $\omega(\cdot)$ denote the average number of bits received by an
algorithm, where the average is taken over the random choices
of the algorithm and the randomness of the puzzle. We have

Theorem 3.1: Let $C_A$ be the event that $A$ returns the correct
answer when $A$ is interacting directly with content, hash and
ans. Let $C_{B_A}$ be the event that $B_A$ returns the correct answer,
when $B_A$ is interacting with $\Omega$ and ans. Then,

$P[C_{B_A}] \ge P[C_A] - \frac{q_{hash}}{2^V}$

and

$\omega(B_A) \le \omega(A) + \frac{Lk\,q_{hash}}{2^V} + V$.

Proof: In our construction, $B_A$ employs $A$, and answers
oracle queries for $A$. Denote the random process of $A$ when it
is interacting directly with content, hash and ans as $W$, and
denote the random process of $A$ when it is interacting with
the oracles implemented by $B_A$ as $W'$. We prove that $W$ and
$W'$ will progress in the same way statistically with only one
exception, while the probability of this exception is bounded.

First, we note that when $A$ makes a query to content or ans,
$B_A$ simply gives the query result; therefore, the only case that needs
to be considered is when $A$ makes a query to hash. When $A$
makes a query for $I_\ell$ to hash,

• If there are still no less than $V$ unknown bits in this
  index set, $B_A$ will simply return a random string, which
  follows the same distribution as the output of the hash
  modeled as a random oracle. If $\ell \ne \hat{\ell}$, such a query will
  not result in a confirm, and this will have the same effect
  on the progress of the algorithm statistically as when $A$
  is making a query to hash. However, if $\ell = \hat{\ell}$, it could
  happen that $A$ is making a query with the true string.
  In this case, the exception occurs. That is, $W'$ will not
  terminate, but $W$ will terminate with the correct answer
  to the puzzle. However, the probability of this exception
  is bounded from above by $\frac{q_{hash}}{2^V}$, because if no less
  than $V$ bits are unknown, the probability of making a
  hash query with the true string is no more than $\frac{1}{2^V}$.

• If $B_A$ has made enough content queries for this index
  set, $B_A$ checks whether it has made a hash query for this
  index set before. If no, $B_A$ makes the hash query, and if
  a confirm is obtained, $B_A$ knows that this is the answer
  index set and gets the possible remaining bits in it; otherwise
  $B_A$ knows that it is not the answer index set. If $I_\ell$ is
  not the answer index set, $B_A$ will simply return a random
  string, which will have the same effect statistically on
  the progress of $A$ as when $A$ is interacting with hash.
  If $I_\ell$ is the answer index set, $B_A$ checks whether $A$ is
  submitting the true string, and returns the true hash if yes
  and a random string otherwise. This, clearly, also has the
  same effect statistically on the progress of $A$ as when $A$
  is interacting with hash.

From the above discussion, we can see that $P[C_{B_A}]$ is
no less than $P[C_A]$ minus the probability of the exception.
Therefore, the first half of the theorem is proved. We can also
see that if the exception occurs, $B_A$ makes at most $Lk$ more
content queries than $A$. If the exception does not occur, $B_A$
receives at most $V$ bits more than the $A$ it encapsulates, and therefore
at most $V$ bits more than $A$ on average when $A$ is interacting
directly with content, hash and ans. ■

Theorem 3.1 allows us to establish a connection between
the "real" puzzle solver and the puzzle solver interacting with
$\Omega$. The advantage of introducing $\Omega$ is that a good algorithm
will not send any uninformed queries to $\Omega$, because it will get
no information from such queries. If there is a bound on the
number of hash queries, which are all informed, it is possible
to establish a lower bound on the number of unique indices
involved in such queries, with which the lower bound of the
puzzle can be established. It is difficult to establish such a bound
based on hash directly because hash answers any queries.
Although some queries are "more informed" than others, all
queries have non-zero probabilities to get a confirm. The next
theorem establishes the lower bound on the expected number
of informed hash queries to achieve a given advantage by an
optimal algorithm interacting with $\Omega$.

Theorem 3.2: Suppose $B$ is an optimal algorithm for solving
the puzzle when interacting with $\Omega$. If $B$ solves the puzzle with
probability no less than $\epsilon$, on average, the number of informed
hash queries it makes is no less than $\frac{(\epsilon - \frac{1}{2^V})(L+1)}{2}$.

Proof: Let correct denote the event that $B$ returns the
correct answer. Note that

$P[correct] = P[correct \mid confirm]\,P[confirm] + P[correct \mid \neg confirm]\,P[\neg confirm]$
$\qquad = P[confirm] + P[correct \mid \neg confirm]\,P[\neg confirm]$
$\qquad \le P[confirm] + P[correct \mid \neg confirm]$
$\qquad \le P[confirm] + \frac{1}{2^V}$.

Note that $P[correct \mid \neg confirm] \le \frac{1}{2^V}$ because if the algorithm
returns the correct answer, it must have the true string of the
answer index set, since ans is collision-resistant. If a confirm
was not obtained, the answer index set is missing no less than
$V$ bits, since otherwise an optimal algorithm should make a
query which will result in a confirm. Therefore, the probability
that the algorithm can obtain the true string of the answer index
set is no more than $\frac{1}{2^V}$. Note that hash queries to $\Omega$ will not
help in the guessing of the true string, because $\Omega$ is aware
of the number of missing bits and will not reply with any
information. Therefore, any algorithm that achieves advantage
$\epsilon$ in solving the puzzle must have an advantage of no less than
$\epsilon - \frac{1}{2^V}$ to get a confirm.

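Let $P_1$ be the probability that $B$ makes no hash query and
let $P_i$ be the probability that $B$ stops making hash queries
after all previous queries (queries 1 to $i-1$) failed to generate
a confirm, for $2 \le i \le L$. Consider the probability that a
confirm is obtained upon the $i$th query. For a given set of
$P_1, P_2, \ldots, P_L$, because $\hat{\ell}$ is picked at random, the probability
is

$(1-P_1)\frac{L-1}{L}(1-P_2)\frac{L-2}{L-1}\cdots(1-P_i)\frac{1}{L-i+1} = \frac{\prod_{j=1}^{i}(1-P_j)}{L}$.

Therefore, the probability that the algorithm can get a confirm
is

$\sum_{i=1}^{L}\frac{\prod_{j=1}^{i}(1-P_j)}{L}$.

The event that exactly $i$ queries are made occurs when a
confirm was obtained upon the $i$th query, or when all first $i$
queries failed to obtain the confirm and the algorithm decides
to stop making queries. The probability is thus

$\frac{\prod_{j=1}^{i}(1-P_j)}{L} + \frac{(L-i)\prod_{j=1}^{i}(1-P_j)\,P_{i+1}}{L}$.

Note that $P_{L+1}$ is not previously defined. However, as $L - i = 0$
when $i = L$, for convenience, we can use the same expression
for all $1 \le i \le L$ for any arbitrary value of $P_{L+1}$. To derive
the lower bound, we therefore need to solve the problem of
minimizing

$\sum_{i=1}^{L} i\left[\frac{\prod_{j=1}^{i}(1-P_j)}{L} + \frac{(L-i)\prod_{j=1}^{i}(1-P_j)\,P_{i+1}}{L}\right]$

subject to the constraint that

$\sum_{i=1}^{L}\frac{\prod_{j=1}^{i}(1-P_j)}{L} = \epsilon - \frac{1}{2^V}$

and

$0 \le P_i \le 1$.

To solve the problem, we let $\eta_i = \prod_{j=1}^{i}(1-P_j)$ and note
that $P_{i+1} = 1 - \frac{\eta_{i+1}}{\eta_i}$. Therefore,

$\sum_{i=1}^{L} i\left[\frac{\eta_i}{L} + \frac{(L-i)(\eta_i - \eta_{i+1})}{L}\right]$
$\qquad = \frac{1}{L}\left[\sum_{i=1}^{L} i(L-i+1)\eta_i - \sum_{i=1}^{L-1} i(L-i)\eta_{i+1}\right]$
$\qquad = \frac{1}{L}\sum_{i=1}^{L}(L-i+1)\eta_i$.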

We therefore consider a new problem of minimizing
$\frac{1}{L}\left[\sum_{i=1}^{L}(L-i+1)\eta_i\right]$ subject to the constraints that
$\sum_{i=1}^{L}\eta_i = L(\epsilon - \frac{1}{2^V})$, $0 \le \eta_i \le 1$, and $\eta_{i+1} \le \eta_i$. The optimal value of
the newly defined problem must be no more than that of the
original problem, because any valid assignment of $\{P_i\}_i$ gives
a valid assignment of $\{\eta_i\}_i$. To achieve the optimal value of
the new problem, note that if $i < j$, the coefficient of $\eta_i$ is
larger than that of $\eta_j$ in the objective function; therefore, to minimize
the objective function, we should reduce $\eta_i$ and increase $\eta_j$.
Considering that $\{\eta_i\}_i$ is nonincreasing, the optimum is
achieved when all $\eta_i$ are set to the same value $\epsilon - \frac{1}{2^V}$, and
the optimal value is $\frac{(\epsilon - \frac{1}{2^V})(L+1)}{2}$.

Based on Theorem 3.2, any algorithm with an advantage
of $\epsilon$ must make no less than a certain number of informed
hash queries to $\Omega$ on average. We next derive the number of
unique indices in a given number of index sets. We first need the
following lemma.

Lemma 3.3: Suppose $c$ indices are randomly picked among
$n$ indices, with repeat. Let $Y$ be the random variable denoting
the number of unique indices among the $c$ indices. Let
$\mu = n(1-\delta)[1-(1-\frac{1}{n})^c]$ for a constant $0 < \delta < 1$, and let $\eta$ be
$c - n\ln\frac{n}{n-\mu}$ divided by the upper bound, derived in the proof, on the
standard deviation of the number of samples needed to collect $\mu$
unique indices. We have

$P[Y < \mu] \le \frac{e^{-\eta^2/2}}{\sqrt{2\pi}\,\eta}$

when $n \to \infty$ and $c \to \infty$.

Proof: Consider the process when indices are randomly
taken from $n$ indices. Let $Z_i$ be the random variable denoting the
number of samples needed to get the $i$th unique index. Clearly,
$P[Z_1 = 1] = 1$. In general, note that $Z_i$ follows the geometric
distribution, i.e.,

$P[Z_i = j] = \left(\frac{i-1}{n}\right)^{j-1}\frac{n-i+1}{n}$.

Let $p_i = \frac{n-i+1}{n}$; we have $E[Z_i] = \frac{1}{p_i}$ and $Var[Z_i] = \frac{1-p_i}{p_i^2}$.
Also, $\{Z_i\}_i$ are independent of each other. Define $S_\mu = \sum_{i=1}^{\mu} Z_i$
and note that $P[Y < \mu] = P[S_\mu > c]$. Therefore,
we will focus on finding $P[S_\mu > c]$.

Define $Z'_i = Z_i - E[Z_i]$. Let $S'_\mu = \sum_{i=1}^{\mu} Z'_i$ and note that
$P[S_\mu > c] = P\left[S'_\mu > c - \sum_{i=1}^{\mu}\frac{1}{p_i}\right]$. As $\{Z'_i\}_i$ are independent
random variables with zero mean, due to the Central Limit
Theorem, $S'_\mu$ approximately follows the Gaussian distribution
with zero mean and variance $\sum_{i=1}^{\mu}\frac{1-p_i}{p_i^2}$. Note that as $n \to \infty$
and $c \to \infty$, $\mu \to \infty$; therefore,

$P[S_\mu > c] = Q\left(\frac{c - \sum_{i=1}^{\mu}\frac{1}{p_i}}{\sqrt{\sum_{i=1}^{\mu}\frac{1-p_i}{p_i^2}}}\right)$,

where $Q()$ is the Gaussian Error Integral. To simplify the
result, note that

$\sum_{i=1}^{\mu}\frac{1}{p_i} = \sum_{i'=0}^{\mu-1}\frac{n}{n-i'} \le n\ln\frac{n}{n-\mu}$

and



1=1 



E 



1 - 



=0 y n ■> 



E 

i'=0 



{n - 



4^2 



n 



< 



1 



i'=0 



(n — i'Y' 



i'=0 



{n - 



1 



n — ji 

Applying these bounds, we have

$P[S_\mu > c] \le Q(\eta)$.

The proof completes since $Q(x) < \frac{e^{-x^2/2}}{\sqrt{2\pi}\,x}$.

Note that according to the well-known coupon collector
problem, $n[1-(1-\frac{1}{n})^c]$ is actually the average number of
unique indices among $c$ indices, and $\delta$ determines how far $\mu$
deviates from this value. This lemma establishes the bound of
the probability that the number of unique indices is less than
a $1-\delta$ fraction of the average.
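
The coupon-collector average quoted above is easy to check numerically. The Python sketch below is an illustration only: the function names and the parameter values used to exercise them are hypothetical, and the simulation draws indices uniformly with repetition, as in the lemma.

```python
import random


def expected_unique(n, c):
    """Average number of unique indices among c draws from n, with repeats."""
    return n * (1 - (1 - 1 / n) ** c)


def sample_unique(n, c, rng):
    """Draw c indices uniformly with repetition and count the distinct ones."""
    return len({rng.randrange(n) for _ in range(c)})


def simulate_unique(n, c, trials, seed=0):
    """Empirical mean of the number of unique indices over several trials."""
    rng = random.Random(seed)
    return sum(sample_unique(n, c, rng) for _ in range(trials)) / trials


def fraction_below(n, c, delta, trials, seed=0):
    """Empirical estimate of P[Y < mu] with mu = (1 - delta) * average."""
    rng = random.Random(seed)
    mu = (1 - delta) * expected_unique(n, c)
    hits = sum(1 for _ in range(trials) if sample_unique(n, c, rng) < mu)
    return hits / trials
```

For moderate n and c the count of unique indices concentrates tightly around its average, so fraction_below is typically zero even for small delta; this is the behavior the lemma quantifies in the asymptotic regime.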

Let $Y_s^J$ denote the random variable of the minimum number
of unique indices in $s$ index sets among all possible choices of
$s$ index sets picked from $J$ index sets. Let
$\mu_s = n(1-\delta)[1-(1-\frac{1}{n})^{sk}]$ and let $\eta_s$ be the value of $\eta$ in
Lemma 3.3 with $c = sk$ and $\mu = \mu_s$. Due to Lemma 3.3, we have

$P[Y_s^J < \mu_s] \le \frac{J^s e^{-\eta_s^2/2}}{\sqrt{2\pi}\,\eta_s}$

as $n \to \infty$ and $sk \to \infty$. This is because event $Y_s^J < \mu_s$
happens if one combination of $s$ index sets has no more than
$\mu_s$ unique indices, which happens with probability as given in
Lemma 3.3, and the total number of combinations to pick $s$
index sets is less than $J^s$.

Considering practical puzzles, we note that n is very large, 
e.g., 10^, as well as k, e.g., 10^. For any $s \ge 1$, as $n \to \infty$ and
$sk \to \infty$, $(1-\frac{1}{n})^{sk}$ approaches $e^{-sk/n}$. We also pick puzzle
parameters as well as $\delta$, such that $P[Y_s^J < \mu_s]$ is negligibly
small for conceivable values of $J$; see Section IV for details,
which has been confirmed by numerical analysis. Therefore,
we guarantee that for the puzzles we use, the number of unique
indices in any $s$ index sets is no less than $(1-\delta)n(1-e^{-sk/n})$
with overwhelming probability.

The next lemma is needed in determining the average
number of unique indices in a certain average number of index
sets.

Lemma 3.4: Consider a linear programming problem of maximizing
$\sum_{i=0}^{L} P_i e^{-id}$ subject to the constraints that $\sum_{i=0}^{L} iP_i = \beta$,
$\sum_{i=0}^{L} P_i = \gamma$, and $P_i \ge 0$, where $0 < d$, $0 < \gamma \le 1$, and
$0 \le \beta \le \gamma L$. Denote the optimal value of the objective
function as $F_L(\beta,\gamma)$; we have

$F_L(\beta,\gamma) = \gamma - \frac{(1-e^{-Ld})\beta}{L}$.

That is, to achieve the optimal value is to let $P_0 = \gamma - \frac{\beta}{L}$ and
$P_L = \frac{\beta}{L}$, while letting $P_i = 0$ for $0 < i < L$.

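Proof: We use induction on $L$. To begin with, consider the case
$L = 1$. In this case, due to the constraints, $P_0$ and $P_1$ are
uniquely determined as $P_0 = \gamma - \beta$ and $P_1 = \beta$. Therefore,

$F_1(\beta,\gamma) = \gamma - \beta + \beta e^{-d}$

and the lemma is true. Suppose the lemma is true till $j$. To
get $F_{j+1}(\beta,\gamma)$, suppose $P_{j+1} = \gamma - \gamma'$, where $\gamma' \le \gamma$. Given
this, $\beta' = \sum_{i=0}^{j} iP_i = \beta - (\gamma-\gamma')(j+1)$, and therefore,

$F_{j+1}(\beta,\gamma) = F_j(\beta',\gamma') + (\gamma-\gamma')e^{-(j+1)d}$
$\qquad = \gamma' - \frac{1-e^{-jd}}{j}\beta' + (\gamma-\gamma')e^{-(j+1)d}$
$\qquad = \gamma'\left[1 - \frac{(1-e^{-jd})(j+1)}{j} - e^{-(j+1)d}\right] - \frac{1-e^{-jd}}{j}[\beta - \gamma(j+1)] + \gamma e^{-(j+1)d}$.

Regarding $\gamma'$ as a variable, its coefficient is

$1 - \frac{(1-e^{-jd})(j+1)}{j} - e^{-(j+1)d}$,

which is no more than 0. To see this, consider function

$f(x) = 1 - \frac{(1-e^{-jx})(j+1)}{j} - e^{-(j+1)x}$.

Note that $f(0) = 0$, and $f'(x) \le 0$ when $x > 0$. Therefore,
to maximize the objective function, $\gamma'$ should be as small as
possible. Note that $jP_j = \beta - (j+1)\gamma + (j+1)\gamma'$ and $P_0 + P_j = \gamma'$.
Therefore,

$P_0 = \frac{(j+1)\gamma - \beta - \gamma'}{j}$.

Since $P_0 \le \gamma'$, we have $\gamma' \ge \gamma - \frac{\beta}{j+1}$. Therefore,
$F_{j+1}(\beta,\gamma) = \gamma - \frac{(1-e^{-(j+1)d})\beta}{j+1}$.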
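
As a numerical sanity check on Lemma 3.4, the Python sketch below compares the objective at randomly generated feasible points against the closed-form optimum. It is an illustration under stated assumptions, not part of the proof: the function names are hypothetical, and rather than fixing β and γ in advance, each random nonnegative vector P defines its own β and γ, for which the lemma's value must be an upper bound on the objective.

```python
import math
import random


def objective(P, d):
    """Objective of the linear program: sum over i of P_i * exp(-i * d)."""
    return sum(p * math.exp(-i * d) for i, p in enumerate(P))


def lemma_optimum(beta, gamma, L, d):
    """Closed form F_L(beta, gamma) = gamma - (1 - exp(-L d)) * beta / L."""
    return gamma - (1 - math.exp(-L * d)) * beta / L


def random_feasible_point(L, rng):
    """Random nonnegative P_0..P_L with total mass below 1, plus its beta, gamma."""
    raw = [rng.random() for _ in range(L + 1)]
    scale = rng.random() / sum(raw)
    P = [x * scale for x in raw]
    beta = sum(i * p for i, p in enumerate(P))
    gamma = sum(P)
    return P, beta, gamma


def extreme_point(beta, gamma, L):
    """The maximizer named in the lemma: mass only on i = 0 and i = L."""
    return [gamma - beta / L] + [0.0] * (L - 1) + [beta / L]
```

The inequality holds because exp(-i d) is convex in i, so moving mass from intermediate values of i to the two endpoints never decreases the objective while preserving both constraints.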

Suppose we randomly pick index sets from a total of $L$ index
sets when the average number of picked index sets is $\beta$. Let
$P_i$ denote the probability that $i$ index sets are picked, where
$0 \le i \le L$. We now give the lower bound of the average
number of unique indices in the index sets picked, denoted as
$U(\beta)$. We have

$U(\beta) \ge \sum_{i=0}^{L}(1-\delta)n(1-e^{-ik/n})P_i$.

Therefore, to derive the lower bound is to maximize
$\sum_{i=0}^{L} e^{-ik/n}P_i$ subject to the constraints $\sum_{i=0}^{L} iP_i = \beta$
and $\sum_{i=0}^{L} P_i = 1$.

Based on Lemma 3.4, we immediately have
Lemma 3.5: If the average number of picked index sets is
$\beta$, then

$U(\beta) \ge \frac{(1-\delta)n(1-e^{-Lk/n})\beta}{L}$,

where $\delta$ is a parameter determined by the puzzle parameters.

We may finally assemble the parts together. Suppose $A$ has
an advantage of $\alpha$ in solving the puzzle when receiving $\omega(A)$
bits on average. Based on Theorem 3.1, $B_A$ has an advantage
of no less than $\alpha - \frac{q_{hash}}{2^V}$ while receiving no more than
$\omega(A) + \frac{Lk\,q_{hash}}{2^V} + V$ bits on average. Based on Theorem 3.2, to achieve
an advantage of at least $\alpha - \frac{q_{hash}}{2^V}$, an algorithm must make at
least $\frac{(\alpha - \frac{q_{hash}+1}{2^V})(L+1)}{2}$ hash queries. Based on Lemma 3.3, also
considering that $B$ needs to receive only $k - V + 1$ bits per
index set, $B$ receives at least
$U\left(\frac{(\alpha - \frac{q_{hash}+1}{2^V})(L+1)}{2}\right) - L(V-1)$
bits on average. Therefore,

Theorem 3.6: Suppose $A$ solves the puzzle with probability
no less than $\alpha$. Let $\omega(A)$ denote the average number of
received bits. We have

$\omega(A) \ge \frac{(1-\delta)n(1-e^{-Lk/n})(\alpha - \frac{q_{hash}+1}{2^V})(L+1)}{2L} - L(V-1) - \frac{Lk\,q_{hash}}{2^V} - V$,

where $q_{hash}$, $V$, and $\delta$ are constants determined by the puzzle
parameters.
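
To get a feel for the size of the terms in this bound, it can be evaluated directly. The Python sketch below assumes the bound has the form $\omega(A) \ge \frac{(1-\delta)n(1-e^{-Lk/n})(\alpha - \frac{q_{hash}+1}{2^V})(L+1)}{2L} - L(V-1) - \frac{Lk\,q_{hash}}{2^V} - V$; the function name and the parameter values in the usage note are hypothetical and are not the settings discussed in Section IV.

```python
import math


def lower_bound_bits(n, k, L, alpha, q_hash, V, delta):
    """Evaluate the single-adversary lower bound on the average received bits."""
    # Advantage left after accounting for the exception and for blind guessing.
    advantage = alpha - (q_hash + 1) / 2 ** V
    # Main term: unique indices covered by the required informed hash queries.
    main = (1 - delta) * n * (1 - math.exp(-L * k / n)) * advantage * (L + 1) / (2 * L)
    # Corrections: bits an informed query may still miss, the cost of the
    # exception case, and the extra bits fetched after a confirm.
    return main - L * (V - 1) - L * k * q_hash / 2 ** V - V
```

For instance, calling lower_bound_bits(n=10**7, k=10**4, L=1000, alpha=0.5, q_hash=2000, V=60, delta=0.1) shows that, for such illustrative values, the main term dominates and the correction terms are small by comparison.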

B. Multiple Adversaries with Multiple Puzzles 

We next consider the more complicated case when multiple 
adversaries are required to solve multiple puzzles. Suppose 
there are A adversaries, and the number of puzzles they attempt 
to solve is P. Note that P is greater than A when z > 1. 

1) Proof Sketch: The proof uses the same idea as the single
adversary case. Basically, we extend $\Omega$ to handle multiple
adversaries, where $\Omega$ gives the correct answer to a hash query from
an adversary only if the number of bits the adversary received
for the index set is greater than $k - V$, regardless of the number
of bits other adversaries received. With similar arguments as
the single adversary case, we can establish the relationship
between the algorithm performance when interacting with $\Omega$
and with the real oracles. We also obtain the average number of
informed queries the adversaries must make to achieve certain
advantages when interacting with $\Omega$. The bound is established
after solving several optimization problems.

2) Proof Details: Suppose the adversaries run an algorithm
A that solves the P puzzles with probability $\sigma$ while
receiving $\omega(A)$ bits on average. We wish to bound
$\omega(A)$ from below for a given $\sigma$. We extend the
definition of $\Omega$ and let it remember the content queries from
each adversary. We use $I_\ell^p$ to denote index set $\ell$ in
puzzle $p$, where $1 \le p \le P$. If an adversary makes a hash
query for $I_\ell^p$ after this adversary has made content queries
for more than $k - V$ bits in $I_\ell^p$, $\Omega$ replies with the
hash of $I_\ell^p$; otherwise, it returns 0. In addition, if B makes
more than L hash queries for a particular puzzle, $\Omega$ will not
answer further hash queries for this puzzle.
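The per-adversary bookkeeping of the extended oracle can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the construction used in the proof: the class name `OmegaSketch`, its method names, the dictionary layout of the index sets, and the placeholder `hash_fn` are all hypothetical. It shows the two rules described above: a hash query is answered only if the querying adversary itself has queried more than k - V bits of the index set, and at most L hash queries are answered per puzzle.

```python
from collections import defaultdict


class OmegaSketch:
    """Toy model of the extended oracle; every name here is hypothetical."""

    def __init__(self, content, index_sets, k, V, L, hash_fn):
        self.content = content          # list of content bits
        self.index_sets = index_sets    # (puzzle p, set l) -> tuple of k indices
        self.k, self.V, self.L = k, V, L
        self.hash_fn = hash_fn          # placeholder for the real hash oracle
        self.seen = defaultdict(set)    # adversary -> indices it has queried
        self.hash_count = defaultdict(int)  # puzzle -> hash queries answered

    def content_query(self, adversary, index):
        # Content queries are remembered separately for each adversary.
        self.seen[adversary].add(index)
        return self.content[index]

    def hash_query(self, adversary, p, l):
        # After L hash queries for puzzle p, no further query is answered.
        if self.hash_count[p] >= self.L:
            return None
        self.hash_count[p] += 1
        indices = self.index_sets[(p, l)]
        known = sum(1 for i in indices if i in self.seen[adversary])
        # Bits known to other adversaries do not count for this adversary.
        if known > self.k - self.V:
            return self.hash_fn(p, l)
        return 0
```

In this sketch, an adversary that has not fetched enough bits itself always receives 0, even when a colluding adversary has fetched the whole index set.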

Similar to the single puzzle case, given an algorithm A
for solving the puzzles, we construct an algorithm employing A,
denoted as $B_A$. $B_A$ terminates when A terminates,
and returns what A returns. Algorithm 2 describes how $B_A$
implements oracle queries for A, which is very similar to the
single adversary case.

Algorithm 2 $B_A$ answers oracle queries for A
1: When A makes a query to content, $B_A$ makes the same
   content query to $\Omega$ and gives the result to A.
2: When A makes a query to ans, $B_A$ makes the same query
   to ans and gives the result to A.
3: When A makes a query for $I_\ell^p$ to hash at adversary v:
   1) $B_A$ checks whether adversary v has made exactly
      the same query before. If yes, it returns the same
      answer.
   2) $B_A$ checks whether there are no less than V bits in
      $I_\ell^p$ that have not been queried for at adversary v,
      and if this is true, it returns a random string.
   3) $B_A$ checks if it has made a hash query for $I_\ell^p$;
      if not, it makes a hash query to $\Omega$. If confirm is
      obtained upon this query, $B_A$ knows $I_\ell^p$ is the
      answer index set of puzzle p. $B_A$ sends content queries
      to $\Omega$ to get the remaining bits in $I_\ell^p$.
   4) If $I_\ell^p$ is not the answer index set of puzzle p,
      $B_A$ returns a random string.
   5) If the string A submitted is the true string of $I_\ell^p$,
      $B_A$ returns the hash of $I_\ell^p$.
   6) $B_A$ returns a random string.



With arguments very similar to those in Theorem 3.1, we have

Theorem 3.7: Let $C_A$ be the event that A returns the correct
answers when it is interacting directly with content, hash and
ans. Let $C_{B_A}$ be the event that $B_A$ returns the correct
answers when it is interacting with $\Omega$ and ans. Then,

$$P[C_{B_A}] \ge P[C_A] - \frac{A q_{hash}}{2^V}$$

and

$$\omega[B_A] \le \omega[A] + \frac{P L k A q_{hash}}{2^V} + VP.$$



Let B denote the optimal algorithm for solving the puzzles
when the algorithm is interacting with $\Omega$. Note that if the
probability that B solves all puzzles is no less than $\epsilon$,
the probability that an individual puzzle is solved is no less
than $\epsilon$. Based on Theorem 3.2, if a puzzle is solved with
probability no less than $\epsilon$, the average number of hash
queries made for this puzzle is no less than $\epsilon(L+1)/2$.
There are P puzzles, and we obtain the following theorem due to
the linearity of expectation.

Theorem 3.8: If the probability that B solves all puzzles is
no less than $\epsilon$, on average, the number of informed hash
queries is no less than

$$\frac{P \epsilon (L+1)}{2}.$$

Next, we wish to bound from below the number of unique
indices if the adversaries collectively have to make T informed
hash queries. Here we define the unique indices at adversary v
as the total number of unique indices in the index sets that it
made hash queries for, and denote it as $u_v$. The total number
of unique indices is defined as $\sum_{v=1}^{A} u_v$. Recall that
$\Omega$ will not answer a hash query from adversary v if adversary
v has not received enough bits for this index set. Note that
$\Omega$ will not answer the hash query even if there exists another
adversary knowing enough bits for this index set. In other words,
content queries made at one adversary do not count as content
queries at other adversaries, which is one of the key differences
between the single adversary case and the multiple adversary
case. B may be able to assign hash queries to the adversaries
intelligently, such that $\sum_{v=1}^{A} u_v$ is minimized. For
instance, if two index sets share a large number of indices, they
should be assigned to the same adversary. Nevertheless, we have

Lemma 3.9: If the number of informed queries made by B
is T,

$$\sum_{v=1}^{A} u_v \ge (1-\delta)\, n \left[ t\,(1 - e^{-q_{hash}k/n}) + (1 - e^{-(T - t q_{hash})k/n}) \right],$$

where $t = \lfloor T/q_{hash} \rfloor$ and $\lfloor x \rfloor$ denotes
the largest integer no more than x.

Proof: An adversary may make no more than $q_{hash}$
queries. Suppose the number of hash queries made by
adversary v is $s_v$. We have

$$\sum_{v=1}^{A} u_v \ge (1-\delta)\, n \sum_{v=1}^{A} (1 - e^{-s_v k/n}).$$

Therefore, to minimize $\sum_{v=1}^{A} u_v$ is to maximize
$\sum_{v=1}^{A} e^{-s_v k/n}$ subject to the constraints that
$\sum_{v=1}^{A} s_v = T$ and $0 \le s_v \le q_{hash}$.

We claim that the optimal is achieved when $s_i$ is set to be
$q_{hash}$ for $1 \le i \le \lfloor T/q_{hash} \rfloor$, which we
show by induction on the number of adversaries. First consider
when A = 2. If $T \le q_{hash}$, we claim that
$\sum_{v=1}^{2} e^{-s_v k/n}$ is maximized when $s_1 = T$ and
$s_2 = 0$, which is because for any valid $s_1$ and $s_2$,

$$1 + e^{-Tk/n} - e^{-s_1 k/n} - e^{-s_2 k/n} = (1 - e^{-s_1 k/n})(1 - e^{-s_2 k/n}) \ge 0.$$
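Similarly, if $q_{hash} < T \le 2q_{hash}$,
$\sum_{v=1}^{2} e^{-s_v k/n}$ is maximized when $s_1 = q_{hash}$
and $s_2 = T - q_{hash}$, which is because for any valid $s_1$
and $s_2$,

$$e^{-q_{hash}k/n} + e^{-(T-q_{hash})k/n} - e^{-s_1 k/n} - e^{-s_2 k/n} = (1 - e^{(-s_1+T-q_{hash})k/n})(e^{(-T+q_{hash})k/n} - e^{-s_2 k/n}) \ge 0.$$

Therefore our claim is true for A = 2. Suppose our claim
is true for A = j. For A = j + 1, suppose in the optimal
assignment, $s_{j+1} = 0$. Then, our claim is true based on the
induction hypothesis. If in the optimal assignment, $s_{j+1} > 0$,
we prove that in the optimal assignment $s_i = q_{hash}$ for all
$1 \le i \le j$, therefore our claim is still true. This is because
if $s_i < q_{hash}$ for some i, we can increase $s_i$ while
decreasing $s_{j+1}$. Using similar arguments as for the case when
A = 2, this will increase the objective function, thus violating
the fact that the assignment is optimal. ■

Similar to the single adversary case, if the average number
of informed hash queries B makes is $\beta$, we need to bound from
below the average number of unique indices in the involved
index sets, denoted as $U(\beta)$.

Lemma 3.10: Consider there are A adversaries given P
puzzles, if the average number of informed hash queries is
$\beta$,

$$U(\beta) \ge \frac{(1-\delta)\, n\, \beta\, (1 - e^{-q_{hash}k/n})}{q_{hash}}.$$

Proof: Denote the probability that there are i queries as
$P_i$, where $0 \le i \le PL$. For notational simplicity, in this
proof, we let $d = k/n$. Based on Lemma 3.9, we want to bound
from below

$$\sum_{i=0}^{PL} P_i \left[ (t_i - 1)(1 - e^{-q_{hash}d}) + (1 - e^{-[i - q_{hash}(t_i - 1)]d}) \right]$$

under the constraints that

$$\sum_{i=0}^{PL} P_i = 1, \qquad \sum_{i=0}^{PL} i P_i = \beta,$$

where $t_i$ is an integer such that
$(t_i - 1)q_{hash} \le i < t_i q_{hash}$. To solve this problem,
suppose C is the minimum integer satisfying
$PL \le Cq_{hash} - 1$; we relax the problem to minimizing

$$\Psi = \sum_{i=0}^{Cq_{hash}-1} P_i \left[ (t_i - 1)(1 - e^{-q_{hash}d}) + (1 - e^{-[i - q_{hash}(t_i - 1)]d}) \right]$$

under the same constraints that

$$\sum_{i=0}^{Cq_{hash}-1} P_i = 1, \qquad \sum_{i=0}^{Cq_{hash}-1} i P_i = \beta.$$

The optimal of the relaxed problem will be no more than the
optimal of the original problem. To solve the relaxed problem,
let

$$\Psi_s = \sum_{i=(s-1)q_{hash}}^{sq_{hash}-1} P_i \left[ (s - 1)(1 - e^{-q_{hash}d}) + (1 - e^{-[i - q_{hash}(s - 1)]d}) \right]$$

for $1 \le s \le C$. Clearly, $\Psi = \sum_{s=1}^{C} \Psi_s$.
Suppose $\{\gamma_s\}_s$ and $\{\beta_s\}_s$ are given with
$\sum_{s=1}^{C} \gamma_s = 1$ and $\sum_{s=1}^{C} \beta_s = \beta$.
We say that they are feasible if it is possible to find
$\{P_i\}_i$ such that

$$\sum_{i=(s-1)q_{hash}}^{sq_{hash}-1} P_i = \gamma_s$$

and

$$\sum_{i=(s-1)q_{hash}}^{sq_{hash}-1} i P_i = \beta_s$$

for all $1 \le s \le C$. Note that $\{\gamma_s\}_s$ and
$\{\beta_s\}_s$ are feasible if and only if

$$(s - 1)q_{hash}\gamma_s \le \beta_s \le (sq_{hash} - 1)\gamma_s.$$

When $\{\gamma_s\}_s$ and $\{\beta_s\}_s$ are given and are
feasible, to minimize $\Psi$ is to minimize each individual
$\Psi_s$. Note that

$$\Psi_s = \gamma_s[(s-1)(1 - e^{-q_{hash}d}) + 1] - \sum_{i=(s-1)q_{hash}}^{sq_{hash}-1} P_i\, e^{-[i - q_{hash}(s-1)]d}$$
$$\phantom{\Psi_s} = \gamma_s[(s-1)(1 - e^{-q_{hash}d}) + 1] - \sum_{h=0}^{q_{hash}-1} P_{q_{hash}(s-1)+h}\, e^{-hd},$$

where $h = i - q_{hash}(s - 1)$. Note that if

$$\sum_{i=(s-1)q_{hash}}^{sq_{hash}-1} P_i = \gamma_s, \qquad \sum_{i=(s-1)q_{hash}}^{sq_{hash}-1} i P_i = \beta_s,$$

then

$$\sum_{h=0}^{q_{hash}-1} P_{q_{hash}(s-1)+h} = \gamma_s, \qquad \sum_{h=0}^{q_{hash}-1} h\, P_{q_{hash}(s-1)+h} = \beta_s - (s-1)q_{hash}\gamma_s.$$

We denote the minimum value of $\Psi_s$ for given $\gamma_s$ and
$\beta_s$ as $\Psi_s^*$. Applying Lemma 3.4,

$$\Psi_s^* = \gamma_s[(s-1)(1 - e^{-q_{hash}d}) + 1] - \gamma_s + \left[\frac{1 - e^{-(q_{hash}-1)d}}{q_{hash} - 1}\right][\beta_s - (s-1)q_{hash}\gamma_s]$$
$$\phantom{\Psi_s^*} = (s-1)\gamma_s\left[1 - e^{-q_{hash}d} - \frac{q_{hash}(1 - e^{-(q_{hash}-1)d})}{q_{hash} - 1}\right] + \left[\frac{1 - e^{-(q_{hash}-1)d}}{q_{hash} - 1}\right]\beta_s.$$

Let

$$a = 1 - e^{-q_{hash}d} - \frac{q_{hash}}{q_{hash} - 1} + \frac{q_{hash}\, e^{-(q_{hash}-1)d}}{q_{hash} - 1}$$

and

$$b = \frac{1 - e^{-(q_{hash}-1)d}}{q_{hash} - 1},$$

we have

$$\sum_{s=1}^{C} \Psi_s^* = b\beta + a\sum_{s=1}^{C}(s - 1)\gamma_s.$$

We also note that $a < 0$, which is because function

$$f(x) = 1 - e^{-q_{hash}x} - \frac{q_{hash}}{q_{hash} - 1} + \frac{q_{hash}\, e^{-(q_{hash}-1)x}}{q_{hash} - 1}$$

is 0 when $x = 0$, while $f'(x) < 0$ for $x > 0$. Therefore,
finding the minimum value of $\Psi$ is equivalent to finding a
set of feasible $\{\gamma_s\}_s$ and $\{\beta_s\}_s$ such that
$\sum_{s=1}^{C}(s - 1)\gamma_s$ is maximized.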

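We consider the problem of maximizing

$$W = \sum_{s=1}^{C}(s - 1)\gamma_s$$

subject to the constraints that

$$(s - 1)q_{hash}\gamma_s \le \beta_s \le (sq_{hash} - 1)\gamma_s,$$
$$\sum_{s=1}^{C}\gamma_s = \gamma, \qquad \sum_{s=1}^{C}\beta_s = \beta,$$

where $0 \le \gamma \le 1$ and $0 \le \beta \le \gamma(Cq_{hash} - 1)$.
Denote the maximum value of W as $W^*$. We claim that

• If $(C - 1)q_{hash}\gamma \le \beta$, $W^* = \gamma(C - 1)$ and
  the optimal is achieved when $\gamma_C = \gamma$,
  $\beta_C = \beta$, while $\gamma_s = 0$ and $\beta_s = 0$ for
  $1 \le s < C$;

• If $(C - 1)q_{hash}\gamma > \beta$, $W^* = \beta/q_{hash}$ and
  the optimal value is achieved when
  $\gamma_1 = \gamma - \frac{\beta}{(C-1)q_{hash}}$, $\beta_1 = 0$,
  $\gamma_C = \frac{\beta}{(C-1)q_{hash}}$, $\beta_C = \beta$,
  while $\gamma_s = 0$ and $\beta_s = 0$ for $1 < s < C$.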

To show this, we use induction on C. First, when C = 2,
$W = \gamma_2$. Note that

• If $q_{hash}\gamma \le \beta$, we can let $\gamma_2 = \gamma$,
  $\beta_2 = \beta$, while $\gamma_1 = 0$ and $\beta_1 = 0$, in
  which case $\gamma_2$ is maximized, while all constraints are
  satisfied;

• If $q_{hash}\gamma > \beta$, note that for any given $\beta_2$,
  $\gamma_2 \le \beta_2/q_{hash} \le \beta/q_{hash}$. When
  $\beta < q_{hash}\gamma$, we may let
  $\gamma_1 = \gamma - \beta/q_{hash}$, $\beta_1 = 0$,
  $\gamma_2 = \beta/q_{hash}$, $\beta_2 = \beta$, such that all
  constraints are satisfied, while $\gamma_2$ is maximized.

Therefore, our claim is true when C = 2. Suppose the claim
is true till C = j. When C = j + 1,

• If $jq_{hash}\gamma \le \beta$, we may let
  $\gamma_{j+1} = \gamma$, $\beta_{j+1} = \beta$, and let
  $\gamma_s = 0$ and $\beta_s = 0$ for $1 \le s \le j$, such that
  all constraints are satisfied. In this case, $W = j\gamma$.
  Since $W \le j\gamma$, we have $W^* = j\gamma$.

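• If $jq_{hash}\gamma > \beta$, suppose some
  $0 \le \gamma' \le \gamma$, $0 \le \beta' \le \beta$ are given
  that also satisfy

  $$(\gamma - \gamma')\, jq_{hash} \le (\beta - \beta') \le (\gamma - \gamma')[(j + 1)q_{hash} - 1]$$

  and

  $$\beta' \le \gamma'(jq_{hash} - 1).$$

  We can let $\sum_{s=1}^{j}\gamma_s = \gamma'$ and
  $\sum_{s=1}^{j}\beta_s = \beta'$.

  - If $(j - 1)q_{hash}\gamma' \le \beta'$, based on the induction
    hypothesis, the maximum value of
    $\sum_{s=1}^{j}(s - 1)\gamma_s$ is $(j - 1)\gamma'$, and hence

    $$W \le j\gamma - \gamma'.$$

    Because

    $$(\gamma - \gamma')\, jq_{hash} \le (\beta - \beta'),$$

    we have

    $$\gamma' \ge \gamma - \frac{\beta}{jq_{hash}} + \frac{\beta'}{jq_{hash}}.$$

    As $(j - 1)q_{hash}\gamma' \le \beta'$, we have

    $$\gamma' \ge j\gamma - \frac{\beta}{q_{hash}}.$$

    Therefore,

    $$W \le \frac{\beta}{q_{hash}}.$$

  - If $(j - 1)q_{hash}\gamma' > \beta'$, based on the induction
    hypothesis, the maximum value of
    $\sum_{s=1}^{j}(s - 1)\gamma_s$ is $\beta'/q_{hash}$, and hence

    $$W \le \frac{\beta'}{q_{hash}} + (\gamma - \gamma')j.$$

    Since $(\gamma - \gamma')j \le (\beta - \beta')/q_{hash}$, we
    have

    $$W \le \frac{\beta}{q_{hash}}.$$

  Note that W achieves $\beta/q_{hash}$ when
  $\gamma_1 = \gamma - \frac{\beta}{jq_{hash}}$, $\beta_1 = 0$,
  $\gamma_{j+1} = \frac{\beta}{jq_{hash}}$, $\beta_{j+1} = \beta$,
  while $\gamma_s = 0$ and $\beta_s = 0$ for $1 < s \le j$.
  Therefore, $W^* = \beta/q_{hash}$.

Note that actually, in the first case when
$jq_{hash}\gamma \le \beta$, $j\gamma \le \beta/q_{hash}$,
therefore we also have $W^* \le \beta/q_{hash}$. Hence,

$$\Psi \ge b\beta + a\,\frac{\beta}{q_{hash}} = \frac{\beta(1 - e^{-q_{hash}d})}{q_{hash}},$$

which completes our proof. ■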
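The last step of the proof relies on two facts about the constants: a is negative, and b + a/q_hash collapses to (1 - e^{-q_hash d})/q_hash. The following sketch checks both numerically. It assumes the definitions a = 1 - e^{-q_hash d} - q_hash b and b = (1 - e^{-(q_hash-1)d})/(q_hash - 1) with d = k/n; the function names and the sample values are illustrative only and are not taken from the paper.

```python
import math


def lemma_constants(q_hash: int, d: float):
    """Return (a, b) as assumed above; requires q_hash >= 2 and d > 0."""
    b = (1.0 - math.exp(-(q_hash - 1) * d)) / (q_hash - 1)
    a = 1.0 - math.exp(-q_hash * d) - q_hash * b
    return a, b


def psi_lower_bound(beta: float, q_hash: int, d: float) -> float:
    """Evaluate b*beta + a*beta/q_hash, the bound reached at W* = beta/q_hash."""
    a, b = lemma_constants(q_hash, d)
    return b * beta + a * beta / q_hash


def closed_form(beta: float, q_hash: int, d: float) -> float:
    """Evaluate beta * (1 - exp(-q_hash * d)) / q_hash directly."""
    return beta * (1.0 - math.exp(-q_hash * d)) / q_hash
```

Since a is negative, a larger value of W can only lower the sum, which is why the proof maximizes W before substituting it.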
Similar to the single puzzle case, we may now put things
together. Suppose A has an advantage of $\sigma$ in solving the
puzzles when receiving $\omega(A)$ bits on average. Based on
Theorem 3.7, $B_A$ has an advantage of no less than
$\sigma - Aq_{hash}/2^V$ while receiving no more than
$\omega(A) + PLkAq_{hash}/2^V + VP$ bits on average.
Based on Theorem 3.8, to achieve an advantage of at least
$\sigma - Aq_{hash}/2^V$, any algorithm must make at least
$P(\sigma - Aq_{hash}/2^V)(L+1)/2$ hash queries. Based on
Lemma 3.10, also considering that B needs to receive only
$k - V + 1$ bits per index set, B receives at least
$U\big(P(\sigma - Aq_{hash}/2^V)(L+1)/2\big) - PL(V - 1)$ bits on
average.

Therefore,

Theorem 3.11: Suppose A adversaries are challenged with
P puzzles. Suppose A solves the puzzles with probability no
less than $\sigma$ and let $\omega(A)$ denote the average number
of received bits. We have

$$\omega(A) \ge \frac{(1-\delta)\, n P \left(\sigma - \frac{Aq_{hash}}{2^V}\right)(L+1)(1 - e^{-q_{hash}k/n})}{2q_{hash}} - PL(V-1) - \frac{PLkAq_{hash}}{2^V} - VP,$$

where $q_{hash}$, $V$, and $\delta$ are constants determined by the
puzzle parameters.

IV. Discussions 

In this section we discuss the bound and its practical 
implications. We begin by considering a simple strategy the 
adversaries may adopt to be compared with the bound. 

A. A Simple Adversary Strategy 

We note that there exists a simple strategy the adversaries
may adopt, against which the bound can be compared. In this
strategy, when challenged with the puzzles, the adversaries flip
a coin and decide whether to attempt to solve the puzzles. They
attempt with probability $\sigma$; otherwise they simply ignore
the puzzles. If they decide to solve the puzzles, the adversaries
select $P(L+1)/(2q_{hash})$ members, and let each of them get the
entire content. Each of the chosen adversaries makes the
$q_{hash}$ hash queries allowed for it. For each puzzle, the
adversaries make hash queries for the index sets one by one until
a confirm is obtained.

We now analyze the performance of this strategy. We argue
that the adversaries can solve the puzzles with probability
close to 1 if they decide to attempt, hence their advantage
is $\sigma$. Note that to get a confirm for a puzzle according to
this strategy, the number of hash queries follows a uniform
distribution in [1, L] and is independent of other puzzles. The
total number of hash queries is a random variable with mean
$P(L+1)/2$. As the number of puzzles increases, the distribution
of this variable approaches a Gaussian distribution centered
around the mean with decreasing variance. Therefore, if the
adversaries can make $P(L+1)/2$ hash queries, the probability
that they can solve the puzzles asymptotically approaches 1.
Note that this is possible because there are $P(L+1)/(2q_{hash})$
selected adversaries, each making $q_{hash}$ queries. According to
this strategy, the average number of bits downloaded is
$\sigma n P(L+1)/(2q_{hash})$.
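The query count of the simple strategy can be illustrated with a short simulation. The sketch below is only an illustration under the stated model (one uniform draw in [1, L] per puzzle, independent across puzzles); the function names, the seed, and the sample parameters are assumptions made here, and the cost formula is the average quoted above for the strategy.

```python
import random


def simulate_total_hash_queries(P: int, L: int, trials: int, seed: int = 0):
    """Draw the total number of hash queries over P puzzles, `trials` times.

    Each puzzle needs a number of queries uniform in [1, L], independently.
    """
    rng = random.Random(seed)
    return [sum(rng.randint(1, L) for _ in range(P)) for _ in range(trials)]


def expected_total_queries(P: int, L: int) -> float:
    """Mean of the total: P puzzles, each with mean (L + 1) / 2."""
    return P * (L + 1) / 2


def simple_strategy_bits(sigma: float, n: int, P: int, L: int, q_hash: int) -> float:
    """Average bits downloaded: sigma * n * P * (L + 1) / (2 * q_hash)."""
    return sigma * n * P * (L + 1) / (2 * q_hash)
```

For example, with P = 50 and L = 20 the expected total is 525 queries, and the empirical mean of `simulate_total_hash_queries(50, 20, 2000)` should fall close to that value.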

B. Puzzle Parameter Space 

As can be observed in Theorem 3.11, the dominating factor
in the number of bits is roughly $\sigma n P(L+1)/(2q_{hash})$ if
the following conditions are satisfied:

1) $\delta$ is small compared to 1,

2) $e^{-q_{hash}k/n}$ is small compared to 1,

3) $2^V$ is much larger than $Aq_{hash}$,

4) $2^V$ is no less than $kAq_{hash}$,

5) V is much smaller than k.

If these conditions are satisfied, the bound approaches the
actual number of bits downloaded by the simple adversary
strategy above, and is therefore tight.
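How close the leading term of the bound comes to the cost of the simple strategy can be sketched numerically. The code below assumes the leading term has the form (1 - delta) n P (sigma - A q_hash / 2^V)(L + 1)(1 - e^{-q_hash k/n}) / (2 q_hash) and divides it by sigma n P (L + 1) / (2 q_hash); the subtracted lower-order terms are ignored. The function name and the sample values of n, k, q_hash and A are illustrative assumptions, not values fixed by the paper.

```python
import math


def leading_term_ratio(delta: float, q_hash: int, k: int, n: int,
                       A: int, V: int, sigma: float = 1.0) -> float:
    """Ratio of the assumed leading term of the bound to the simple-strategy cost."""
    shrink = (1.0 - delta) * (1.0 - math.exp(-q_hash * k / n))
    advantage = (sigma - A * q_hash / 2 ** V) / sigma
    return shrink * advantage


# Illustrative values only: q_hash * k = 4n, delta = 0.1, V = 60.
ratio = leading_term_ratio(delta=0.1, q_hash=4000, k=10**4, n=10**7, A=1000, V=60)
```

With these sample values the ratio is about 0.88, so under Conditions 1 to 3 the leading term stays within a small constant factor of the simple strategy.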

[Figure 1: two panels, (a) and (b), each plotted against the
Number of Adversaries.]

Fig. 1. Comparison of the average number of bits downloaded by the simple
strategy and the bound when $\sigma = 1$, k = 10*, $q_{hash} = 4n/k$ and
$Lz = q_{hash}/2$. (a) n = 10^. (b) n = 10*.

We show that there is a wide range of values of L, k, z, n
and $q_{hash}$ satisfying these conditions, for which the bound can
be applied to provide security guarantees. Note that P = Az.
In the following we use an example to illustrate the choice of
parameters when A < 10^; the parameters can be similarly
determined for other values of A. Concerning the conditions,

• For Condition 1, we note that $\delta$ should be as small as
possible, provided that the probability that the number
of unique indices in any s index sets is less than
$(1 - \delta)n(1 - e^{-sk/n})$ is negligibly small. We find
numerically that when A < 10^ for k > 10^, zL < 10^, n > lO'^, if
$\delta = 0.1$, this probability for any s is below 10^^^.

• Condition 2 can be considered as satisfied when
$q_{hash}k \ge 4n$, noting that $e^{-4} = 0.018$.

• When A < 10'^, V can be set to be 60. Condition 3
is satisfied when $q_{hash}$ < 10^. Condition 4 is satisfied if
k < 10^. Condition 5 is satisfied if k > 10".

The above discussions give the range of the puzzle
parameters. Basically, if we let $\delta = 0.1$ and V = 60, we only
require n > 10^, 10^ > k > 10-*, 10^ > Lz, $q_{hash}k \ge 4n$,
10^ > $q_{hash}$, when A < 10^.

Note that k should be set to its lower bound lO", because
a larger value of k results in a heavier load on the verifier.
$q_{hash}$ should be no less than Lz to ensure an honest prover
can solve the puzzles. Figure 1 shows the average number of
bits needed by the simple strategy and the lower bound as
a function of the number of adversaries for different content
sizes, when $\sigma = 1$, $q_{hash} = 4n/k$ and $Lz = q_{hash}/2$.
We can see that they differ only by a small constant factor. We
have tested other parameters satisfying the constraints and the
results show similar trends.
C. Puzzle Parameters in Practice 

We also note that the parameter space is not restrictive
in practice. Considering the speed of modern communication
networks, a reasonable rate to challenge the prover should be
once at least every 1MB of data, i.e., when n is at least around
$10^7$. There is also no obstacle to set k to be 10^ . Concerning
$\theta$, which determines $q_{hash}$, note that the puzzles should
be solved in a reasonable amount of time to reduce the load of
the prover, but the time should also be non-trivial to account for
random fluctuations of network latency. Therefore a reasonable
value of $\theta$ should be in the order of several seconds. Given
these choices of n, k, and $\theta$, z and L should satisfy three
conditions, according to our earlier discussions: (1) kL < 10®,
(2) kLz should be no less than, for example, 2n, and (3) the time
to make kL hash queries should be several seconds but no more
than $\theta$.

machine   CPU             SHA1      AES
pc3000    3.0GHz 64-bit   202165    4059157
pc2000    2.0GHz          71016     2605490
pc850     850MHz          39151     1086667
pc600     600MHz          29064     789624

TABLE 3
SHA-1 AND AES FUNCTION CALLS EXECUTED IN ONE SECOND.

Note that there are two time-consuming tasks when making
hash queries, which are the hash function call and the
generation of the random indices. The choices of hash function
and random number generator have been discussed in [14].
Basically, secure hash functions such as SHA-1 can be used
as the hash function and block ciphers such as AES can be
used to generate the random indices. The optimization of the
puzzle implementation is out of the scope of this paper due to
the limit of space. We here show the speed of several machines
in Emulab [1] when executing the SHA-1 hash and the AES
encryption in the OpenSSL library [2], summarized in Table 3,
when the input to SHA-1 is 10^ bits and the AES is 128 bits.
If n ~ 10^, k = 10'' and $\theta$ = 3 sec, the results indicate that
on modern mainstream machines such as pc3000 and pc2000, (1)
it is not possible to make more than 10® hash queries within
$\theta$, (2) it is possible to make enough hash queries
within $\theta$ such that kLz > 2n after optimizations in random
index generation, and (3) when kLz > 2n, solving the puzzles
will take time in the order of seconds.
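The SHA-1 rates in Table 3 give a quick upper estimate of how many hash calls fit in the time limit. The sketch below simply multiplies the measured calls per second by the time limit; it ignores the cost of random index generation, so it is an optimistic estimate, and the dictionary and function names are illustrative assumptions.

```python
# SHA-1 calls per second, copied from Table 3.
SHA1_CALLS_PER_SECOND = {
    "pc3000": 202165,
    "pc2000": 71016,
    "pc850": 39151,
    "pc600": 29064,
}


def hash_call_budget(machine: str, theta_seconds: int) -> int:
    """Upper estimate of SHA-1 calls possible within theta_seconds."""
    return SHA1_CALLS_PER_SECOND[machine] * theta_seconds


budgets = {name: hash_call_budget(name, 3) for name in SHA1_CALLS_PER_SECOND}
```

With a time limit of 3 seconds, this gives 606495 calls on pc3000 and 213048 calls on pc2000.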

V. Related Work 

Using puzzles has been proposed (e.g., in ifTol . lfT2l . 13], 
||9l , lISl) to defend against email spamming or denial of service 
attacks. In these schemes, the clients are required to spend 
time to solve puzzles before getting access to the service. 
The purpose of the bandwidth puzzle is to verify whether 
the claimed content transactions took place, where the ability 
to solve the puzzles is tied to the amount of content actually
downloaded. As such, the construction of the bandwidth puzzle 
is different from existing puzzles. 

Proofs of data possession (PDP) (e.g., E), [TTI, fsl) and
Proofs of retrievability (POR) (e.g., JTS], Q, |[T5J) have been
proposed to allow a client to verify whether the data has
been modified in a remote store. As discussed in [14], the
key differences between PDP/POR schemes and the bandwidth
puzzle include the following. First, PDP/POR assumes a single
verifier and prover, while the bandwidth puzzle considers one
verifier with many potentially colluding provers. Second, the
bandwidth puzzle has low computational cost at the verifier,
which is desirable in the case when one verifier has to handle
many provers, while the existing PDP/POR schemes may incur
heavy computational cost at the verifier. The proof techniques
for PDP/POR schemes are also different from the techniques
used in this paper, because collusion is not considered in
existing PDP/POR schemes.

The bandwidth puzzle was first proposed in [14] along with
a detailed performance evaluation based on an implemented
p2p streaming system and a larger simulated p2p network.
The results show that the bandwidth puzzle can effectively
improve the performance of the honest users in the presence
of colluding adversaries. A bound was also given in [14] on
the expected number of puzzles solved when the adversaries
can make no more than a certain number of hash queries
and download no more than a certain number of content
bits. The purpose of this work is to derive a bound on the
average number of bits downloaded, when the adversaries can
make no more than a certain number of hash queries but can
download as many content bits as they wish. Therefore the
problem studied in this work is different from that in [14].
Our analysis shows that the new bound is asymptotically tight
for all numbers of adversaries, while the bound given in [14]
deteriorates quickly as the number of adversaries increases,
and the largest number of adversaries used in [14] is 50 when
evaluating the bound. As discussed earlier, it is not difficult to
find puzzle parameters satisfying the requirements of the new
bound. The bound given in [14] requires much more restrictive
choices of parameters. For instance, the suggested values of k
and L are in'^/^° and -^^'i'l/ioo^ respectively (it was assumed
that z = 1 in [14]). A consequence is that the limit on the number
of bits downloaded by the adversaries must be significantly
smaller than n for the bound to give satisfactory results.

VI. Conclusions 

In this paper, we proved a new bound on the performance
of the bandwidth puzzle, which has been proposed to defend
against colluding adversaries in p2p content distribution
networks. Our proof is based on reduction, and gives the
lower bound of the average number of downloaded bits needed
for the adversaries to achieve a certain advantage. The bound is
asymptotically tight in the sense that it is a small fraction
away from the average number of bits downloaded when
following a simple strategy. The new bound is a significant
improvement over the existing bound, which was derived under
more restrictive conditions and is much looser. The improved
bound can be used to guide the choice of puzzle parameters
to improve the performance of practical systems.